Overview

Dataset statistics

Number of variables: 30
Number of observations: 100000
Missing cells: 871235
Missing cells (%): 29.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 22.2 MiB
Average record size in memory: 233.0 B
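
The Missing cells (%) figure can be checked by hand. The sketch below assumes that the total cell count is the number of variables times the number of observations; the variable names are illustrative and not part of the report.

```python
# Figures copied from the Dataset statistics block.
n_variables = 30
n_observations = 100_000
missing_cells = 871_235

# Assumption: every variable contributes one cell per observation.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
```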

Variable types

CAT: 20
NUM: 7
BOOL: 3

Warnings

crash_date has a high cardinality: 551 distinct values High cardinality
crash_time has a high cardinality: 1440 distinct values High cardinality
location has a high cardinality: 44605 distinct values High cardinality
on_street_name has a high cardinality: 4327 distinct values High cardinality
off_street_name has a high cardinality: 4897 distinct values High cardinality
cross_street_name has a high cardinality: 22829 distinct values High cardinality
contributing_factor_vehicle_1 has a high cardinality: 54 distinct values High cardinality
vehicle_type_code1 has a high cardinality: 366 distinct values High cardinality
vehicle_type_code2 has a high cardinality: 385 distinct values High cardinality
vehicle_type_code_3 has a high cardinality: 64 distinct values High cardinality
longitude is highly correlated with latitude High correlation
latitude is highly correlated with longitude High correlation
number_of_motorist_injured is highly correlated with number_of_persons_injured High correlation
number_of_persons_injured is highly correlated with number_of_motorist_injured High correlation
borough has 35026 (35.0%) missing values Missing
zip_code has 35034 (35.0%) missing values Missing
latitude has 8035 (8.0%) missing values Missing
longitude has 8035 (8.0%) missing values Missing
location has 8035 (8.0%) missing values Missing
on_street_name has 26009 (26.0%) missing values Missing
off_street_name has 52875 (52.9%) missing values Missing
cross_street_name has 74033 (74.0%) missing values Missing
contributing_factor_vehicle_2 has 19243 (19.2%) missing values Missing
contributing_factor_vehicle_3 has 91239 (91.2%) missing values Missing
contributing_factor_vehicle_4 has 97760 (97.8%) missing values Missing
contributing_factor_vehicle_5 has 99333 (99.3%) missing values Missing
vehicle_type_code2 has 26589 (26.6%) missing values Missing
vehicle_type_code_3 has 91671 (91.7%) missing values Missing
vehicle_type_code_4 has 97853 (97.9%) missing values Missing
vehicle_type_code_5 has 99354 (99.4%) missing values Missing
latitude is highly skewed (γ1 = -23.18863039) Skewed
cross_street_name is uniformly distributed Uniform
collision_id has unique values Unique
number_of_persons_injured has 72699 (72.7%) zeros Zeros
number_of_pedestrians_injured has 95454 (95.5%) zeros Zeros
number_of_motorist_injured has 81887 (81.9%) zeros Zeros

Reproduction

Analysis started: 2020-12-09 11:05:13.959294
Analysis finished: 2020-12-09 11:05:41.660428
Duration: 27.7 seconds
Software version: pandas-profiling v2.9.0
Download configuration: config.yaml

Variables

crash_date
Categorical

HIGH CARDINALITY

Distinct: 551
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
2019-07-19T00:00:00.000 | 664 | 0.7%
2019-07-16T00:00:00.000 | 654 | 0.7%
2019-07-26T00:00:00.000 | 648 | 0.6%
2019-09-03T00:00:00.000 | 648 | 0.6%
2019-07-29T00:00:00.000 | 646 | 0.6%
2019-08-08T00:00:00.000 | 645 | 0.6%
2019-08-09T00:00:00.000 | 642 | 0.6%
2019-07-22T00:00:00.000 | 637 | 0.6%
2019-07-15T00:00:00.000 | 636 | 0.6%
2019-07-30T00:00:00.000 | 634 | 0.6%
Other values (541) | 93546 | 93.5%

Frequencies of value counts

Unique

Unique: 90
Unique (%): 0.1%

Histogram of lengths of the category

Length

Max length: 23
Median length: 23
Mean length: 23
Min length: 23

crash_time
Categorical

HIGH CARDINALITY

Distinct: 1440
Distinct (%): 1.4%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
0:00 | 1637 | 1.6%
17:00 | 1363 | 1.4%
16:00 | 1360 | 1.4%
14:00 | 1298 | 1.3%
15:00 | 1246 | 1.2%
18:00 | 1231 | 1.2%
13:00 | 1153 | 1.2%
12:00 | 1103 | 1.1%
19:00 | 996 | 1.0%
10:00 | 971 | 1.0%
Other values (1430) | 87642 | 87.6%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 5
Median length: 5
Mean length: 4.74399
Min length: 4

borough
Categorical

MISSING

Distinct: 5
Distinct (%): < 0.1%
Missing: 35026
Missing (%): 35.0%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
BROOKLYN | 22118 | 22.1%
QUEENS | 18322 | 18.3%
BRONX | 11927 | 11.9%
MANHATTAN | 10637 | 10.6%
STATEN ISLAND | 1970 | 2.0%
(Missing) | 35026 | 35.0%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 13
Median length: 6
Mean length: 5.72932
Min length: 3

zip_code
Real number (ℝ≥0)

MISSING

Distinct: 203
Distinct (%): 0.3%
Missing: 35034
Missing (%): 35.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10901.65319
Minimum: 10000
Maximum: 11697
Zeros: 0
Zeros (%): 0.0%
Memory size: 781.2 KiB

Quantile statistics

Minimum: 10000
5-th percentile: 10013
Q1: 10457
median: 11209
Q3: 11354
95-th percentile: 11432
Maximum: 11697
Range: 1697
Interquartile range (IQR): 897
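
The Range and Interquartile range (IQR) entries follow directly from the quantiles listed for zip_code. A minimal sketch, using only figures from this report (the variable names are illustrative):

```python
# Quantile figures for zip_code, copied from the report.
minimum, q1, q3, maximum = 10000, 10457, 11354, 11697

value_range = maximum - minimum  # spread between the two extremes
iqr = q3 - q1                    # spread of the middle 50% of values

print(f"Range: {value_range}")
print(f"Interquartile range (IQR): {iqr}")
```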

Descriptive statistics

Standard deviation: 523.4949054
Coefficient of variation (CV): 0.04801977245
Kurtosis: -1.261718063
Mean: 10901.65319
Median Absolute Deviation (MAD): 203
Skewness: -0.5985244273
Sum: 708236801
Variance: 274046.916
Monotonicity: Not monotonic
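
Several of the descriptive statistics for zip_code are tied together by simple identities. The sketch below recomputes them from the reported figures, assuming the mean is taken over non-missing values only; the variable names are illustrative.

```python
import math

# Figures for zip_code, copied from the report.
n_observations = 100_000
n_missing = 35_034
total = 708_236_801
std = 523.4949054

n_valid = n_observations - n_missing  # assumption: missing values are excluded
mean = total / n_valid                # mean = sum / number of valid values
cv = std / mean                       # coefficient of variation
variance = std ** 2                   # variance = standard deviation squared

print(f"Mean: {mean:.5f}")
print(f"Coefficient of variation (CV): {cv:.8f}")
print(f"Variance: {variance:.3f}")
```

The recomputed values should agree with the reported Mean, CV, and Variance up to rounding.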
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
11207 | 1510 | 1.5%
11236 | 1175 | 1.2%
11212 | 1088 | 1.1%
11208 | 1071 | 1.1%
11385 | 1015 | 1.0%
11203 | 988 | 1.0%
11434 | 960 | 1.0%
11234 | 931 | 0.9%
11226 | 923 | 0.9%
11368 | 864 | 0.9%
Other values (193) | 54441 | 54.4%
(Missing) | 35034 | 35.0%

Value | Count | Frequency (%)
10000 | 19 | < 0.1%
10001 | 460 | 0.5%
10002 | 639 | 0.6%
10003 | 309 | 0.3%
10004 | 88 | 0.1%

Value | Count | Frequency (%)
11697 | 11 | < 0.1%
11695 | 1 | < 0.1%
11694 | 123 | 0.1%
11693 | 104 | 0.1%
11692 | 123 | 0.1%

latitude
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED

Distinct: 33675
Distinct (%): 36.6%
Missing: 8035
Missing (%): 8.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.65191698
Minimum: 0
Maximum: 40.91217
Zeros: 169
Zeros (%): 0.2%
Memory size: 781.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 40.60009
Q1: 40.667915
median: 40.717724
Q3: 40.785595
95-th percentile: 40.864254
Maximum: 40.91217
Range: 40.91217
Interquartile range (IQR): 0.11768

Descriptive statistics

Standard deviation: 1.746142914
Coefficient of variation (CV): 0.04295351963
Kurtosis: 536.8860631
Mean: 40.65191698
Median Absolute Deviation (MAD): 0.053046
Skewness: -23.18863039
Sum: 3738553.545
Variance: 3.049015076
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 169 | 0.2%
40.861862 | 79 | 0.1%
40.8047 | 56 | 0.1%
40.820305 | 52 | 0.1%
40.696033 | 48 | < 0.1%
40.675735 | 48 | < 0.1%
40.658577 | 47 | < 0.1%
40.737785 | 45 | < 0.1%
40.651863 | 43 | < 0.1%
40.65965 | 42 | < 0.1%
Other values (33665) | 91336 | 91.3%
(Missing) | 8035 | 8.0%

Value | Count | Frequency (%)
0 | 169 | 0.2%
40.501465 | 1 | < 0.1%
40.50331 | 1 | < 0.1%
40.503387 | 1 | < 0.1%
40.503414 | 1 | < 0.1%

Value | Count | Frequency (%)
40.91217 | 1 | < 0.1%
40.912117 | 1 | < 0.1%
40.912018 | 1 | < 0.1%
40.91038 | 1 | < 0.1%
40.91032 | 2 | < 0.1%

longitude
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 26494
Distinct (%): 28.8%
Missing: 8035
Missing (%): 8.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -73.78199499
Minimum: -201.23706
Maximum: 0
Zeros: 169
Zeros (%): 0.2%
Memory size: 781.2 KiB

Quantile statistics

Minimum: -201.23706
5-th percentile: -74.015508
Q1: -73.96087
median: -73.91811
Q3: -73.86286
95-th percentile: -73.761
Maximum: 0
Range: 201.23706
Interquartile range (IQR): 0.09801

Descriptive statistics

Standard deviation: 3.276307216
Coefficient of variation (CV): -0.04440524028
Kurtosis: 569.2955403
Mean: -73.78199499
Median Absolute Deviation (MAD): 0.04826
Skewness: 18.42731855
Sum: -6785361.169
Variance: 10.73418897
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
0 | 169 | 0.2%
-73.91282 | 83 | 0.1%
-73.89063 | 73 | 0.1%
-73.91243 | 61 | 0.1%
-73.89083 | 58 | 0.1%
-73.89686 | 53 | 0.1%
-73.98453 | 52 | 0.1%
-73.93755 | 47 | < 0.1%
-73.96191 | 46 | < 0.1%
-73.86536 | 46 | < 0.1%
Other values (26484) | 91277 | 91.3%
(Missing) | 8035 | 8.0%

Value | Count | Frequency (%)
-201.23706 | 4 | < 0.1%
-74.253006 | 1 | < 0.1%
-74.250824 | 1 | < 0.1%
-74.25076 | 1 | < 0.1%
-74.25015 | 1 | < 0.1%

Value | Count | Frequency (%)
0 | 169 | 0.2%
-73.700584 | 1 | < 0.1%
-73.70073 | 1 | < 0.1%
-73.70099 | 1 | < 0.1%
-73.701004 | 1 | < 0.1%

location
Categorical

HIGH CARDINALITY
MISSING

Distinct: 44605
Distinct (%): 48.5%
Missing: 8035
Missing (%): 8.0%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
(0.0, 0.0) | 169 | 0.2%
(40.861862, -73.91282) | 79 | 0.1%
(40.8047, -73.91243) | 55 | 0.1%
(40.820305, -73.89083) | 52 | 0.1%
(40.696033, -73.98453) | 48 | < 0.1%
(40.675735, -73.89686) | 48 | < 0.1%
(40.658577, -73.89063) | 47 | < 0.1%
(40.737785, -73.93496) | 43 | < 0.1%
(40.733536, -73.87035) | 41 | < 0.1%
(40.66496, -73.82226) | 40 | < 0.1%
Other values (44595) | 91343 | 91.3%
(Missing) | 8035 | 8.0%

Frequencies of value counts

Unique

Unique: 29003
Unique (%): 31.5%

Histogram of lengths of the category

Length

Max length: 25
Median length: 22
Mean length: 20.20682
Min length: 3

on_street_name
Categorical

HIGH CARDINALITY
MISSING

Distinct: 4327
Distinct (%): 5.8%
Missing: 26009
Missing (%): 26.0%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
BELT PARKWAY | 1616 | 1.6%
LONG ISLAND EXPRESSWAY | 1053 | 1.1%
BROOKLYN QUEENS EXPRESSWAY | 956 | 1.0%
BROADWAY | 863 | 0.9%
FDR DRIVE | 852 | 0.9%
GRAND CENTRAL PKWY | 820 | 0.8%
ATLANTIC AVENUE | 717 | 0.7%
MAJOR DEEGAN EXPRESSWAY | 674 | 0.7%
CROSS BRONX EXPY | 652 | 0.7%
CROSS ISLAND PARKWAY | 605 | 0.6%
Other values (4317) | 65183 | 65.2%
(Missing) | 26009 | 26.0%

Frequencies of value counts

Unique

Unique: 1369
Unique (%): 1.9%

Histogram of lengths of the category

Length

Max length: 32
Median length: 32
Mean length: 24.45739
Min length: 3

off_street_name
Categorical

HIGH CARDINALITY
MISSING

Distinct: 4897
Distinct (%): 10.4%
Missing: 52875
Missing (%): 52.9%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
3 AVENUE | 432 | 0.4%
BROADWAY | 424 | 0.4%
2 AVENUE | 340 | 0.3%
LINDEN BOULEVARD | 280 | 0.3%
5 AVENUE | 247 | 0.2%
ATLANTIC AVENUE | 240 | 0.2%
1 AVENUE | 237 | 0.2%
7 AVENUE | 229 | 0.2%
PARK AVENUE | 222 | 0.2%
QUEENS BOULEVARD | 218 | 0.2%
Other values (4887) | 44256 | 44.3%
(Missing) | 52875 | 52.9%

Frequencies of value counts

Unique

Unique: 1704
Unique (%): 3.6%

Histogram of lengths of the category

Length

Max length: 32
Median length: 3
Mean length: 7.80378
Min length: 1

cross_street_name
Categorical

HIGH CARDINALITY
MISSING
UNIFORM

Distinct: 22829
Distinct (%): 87.9%
Missing: 74033
Missing (%): 74.0%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
772 EDGEWATER ROAD | 35 | < 0.1%
501 GATEWAY DRIVE | 21 | < 0.1%
90-15 QUEENS BOULEVARD | 19 | < 0.1%
123-01 ROOSEVELT AVENUE | 18 | < 0.1%
2100 BARTOW AVENUE | 14 | < 0.1%
985 RICHMOND AVENUE | 13 | < 0.1%
815 HUTCHINSON RIVER PARKWAY | 12 | < 0.1%
135-05 20 AVENUE | 12 | < 0.1%
355 FOOD CENTER DRIVE | 12 | < 0.1%
1 ORCHARD BEACH ROAD | 12 | < 0.1%
Other values (22819) | 25799 | 25.8%
(Missing) | 74033 | 74.0%

Frequencies of value counts

Unique

Unique: 20794
Unique (%): 80.1%

Histogram of lengths of the category

Length

Max length: 40
Median length: 3
Mean length: 12.60779
Min length: 3

number_of_persons_injured
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.37196
Minimum: 0
Maximum: 15
Zeros: 72699
Zeros (%): 72.7%
Memory size: 781.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 1
95-th percentile: 2
Maximum: 15
Range: 15
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 0.7439161865
Coefficient of variation (CV): 1.999989748
Kurtosis: 16.5808926
Mean: 0.37196
Median Absolute Deviation (MAD): 0
Skewness: 3.147118256
Sum: 37196
Variance: 0.5534112925
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=13)

Value | Count | Frequency (%)
0 | 72699 | 72.7%
1 | 21011 | 21.0%
2 | 4125 | 4.1%
3 | 1308 | 1.3%
4 | 523 | 0.5%
5 | 196 | 0.2%
6 | 77 | 0.1%
7 | 36 | < 0.1%
8 | 14 | < 0.1%
9 | 5 | < 0.1%
Other values (3) | 6 | < 0.1%

Value | Count | Frequency (%)
0 | 72699 | 72.7%
1 | 21011 | 21.0%
2 | 4125 | 4.1%
3 | 1308 | 1.3%
4 | 523 | 0.5%

Value | Count | Frequency (%)
15 | 1 | < 0.1%
11 | 3 | < 0.1%
10 | 2 | < 0.1%
9 | 5 | < 0.1%
8 | 14 | < 0.1%
 
Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
0 | 99816 | 99.8%
1 | 176 | 0.2%
2 | 7 | < 0.1%
3 | 1 | < 0.1%

Frequencies of value counts

Unique

Unique: 1
Unique (%): < 0.1%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

number_of_pedestrians_injured
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.04739
Minimum: 0
Maximum: 6
Zeros: 95454
Zeros (%): 95.5%
Memory size: 781.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 6
Range: 6
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.2234383296
Coefficient of variation (CV): 4.714883512
Kurtosis: 38.41899762
Mean: 0.04739
Median Absolute Deviation (MAD): 0
Skewness: 5.270026474
Sum: 4739
Variance: 0.04992468715
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=7)

Value | Count | Frequency (%)
0 | 95454 | 95.5%
1 | 4383 | 4.4%
2 | 142 | 0.1%
3 | 17 | < 0.1%
6 | 2 | < 0.1%
5 | 1 | < 0.1%
4 | 1 | < 0.1%

Value | Count | Frequency (%)
0 | 95454 | 95.5%
1 | 4383 | 4.4%
2 | 142 | 0.1%
3 | 17 | < 0.1%
4 | 1 | < 0.1%

Value | Count | Frequency (%)
6 | 2 | < 0.1%
5 | 1 | < 0.1%
4 | 1 | < 0.1%
3 | 17 | < 0.1%
2 | 142 | 0.1%
 
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
0 | 99936 | 99.9%
1 | 64 | 0.1%

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
0 | 95147 | 95.1%
1 | 4744 | 4.7%
2 | 107 | 0.1%
3 | 2 | < 0.1%

Frequencies of value counts

Unique

Unique: 0
Unique (%): 0.0%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
0 | 99975 | > 99.9%
1 | 25 | < 0.1%

number_of_motorist_injured
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 13
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.27492
Minimum: 0
Maximum: 15
Zeros: 81887
Zeros (%): 81.9%
Memory size: 781.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 15
Range: 15
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.711058401
Coefficient of variation (CV): 2.586419326
Kurtosis: 21.76181987
Mean: 0.27492
Median Absolute Deviation (MAD): 0
Skewness: 3.819224309
Sum: 27492
Variance: 0.5056040496
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=13)

Value | Count | Frequency (%)
0 | 81887 | 81.9%
1 | 12243 | 12.2%
2 | 3767 | 3.8%
3 | 1259 | 1.3%
4 | 523 | 0.5%
5 | 189 | 0.2%
6 | 73 | 0.1%
7 | 34 | < 0.1%
8 | 14 | < 0.1%
9 | 5 | < 0.1%
Other values (3) | 6 | < 0.1%

Value | Count | Frequency (%)
0 | 81887 | 81.9%
1 | 12243 | 12.2%
2 | 3767 | 3.8%
3 | 1259 | 1.3%
4 | 523 | 0.5%

Value | Count | Frequency (%)
15 | 1 | < 0.1%
11 | 3 | < 0.1%
10 | 2 | < 0.1%
9 | 5 | < 0.1%
8 | 14 | < 0.1%
 
Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
0 | 99904 | 99.9%
1 | 89 | 0.1%
2 | 6 | < 0.1%
3 | 1 | < 0.1%

Frequencies of value counts

Unique

Unique: 1
Unique (%): < 0.1%

Histogram of lengths of the category

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

contributing_factor_vehicle_1
Categorical

HIGH CARDINALITY

Distinct: 54
Distinct (%): 0.1%
Missing: 371
Missing (%): 0.4%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
Driver Inattention/Distraction | 25605 | 25.6%
Unspecified | 25253 | 25.3%
Following Too Closely | 7530 | 7.5%
Failure to Yield Right-of-Way | 6023 | 6.0%
Backing Unsafely | 4033 | 4.0%
Passing or Lane Usage Improper | 3979 | 4.0%
Passing Too Closely | 3676 | 3.7%
Other Vehicular | 3071 | 3.1%
Unsafe Lane Changing | 2588 | 2.6%
Unsafe Speed | 2447 | 2.4%
Other values (44) | 15424 | 15.4%

Frequencies of value counts

Unique

Unique: 2
Unique (%): < 0.1%

Histogram of lengths of the category

Length

Max length: 53
Median length: 21
Mean length: 21.15973
Min length: 3

contributing_factor_vehicle_2
Categorical

MISSING

Distinct: 47
Distinct (%): 0.1%
Missing: 19243
Missing (%): 19.2%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
Unspecified | 67739 | 67.7%
Driver Inattention/Distraction | 5284 | 5.3%
Following Too Closely | 1296 | 1.3%
Other Vehicular | 1249 | 1.2%
Passing or Lane Usage Improper | 802 | 0.8%
Failure to Yield Right-of-Way | 716 | 0.7%
Passing Too Closely | 538 | 0.5%
Unsafe Lane Changing | 402 | 0.4%
Unsafe Speed | 383 | 0.4%
Traffic Control Disregarded | 370 | 0.4%
Other values (37) | 1978 | 2.0%
(Missing) | 19243 | 19.2%

Frequencies of value counts

Unique

Unique: 2
Unique (%): < 0.1%

Histogram of lengths of the category

Length

Max length: 53
Median length: 11
Mean length: 11.33302
Min length: 3

contributing_factor_vehicle_3
Categorical

MISSING

Distinct: 30
Distinct (%): 0.3%
Missing: 91239
Missing (%): 91.2%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
Unspecified | 8197 | 8.2%
Following Too Closely | 176 | 0.2%
Other Vehicular | 171 | 0.2%
Driver Inattention/Distraction | 118 | 0.1%
Reaction to Uninvolved Vehicle | 16 | < 0.1%
Unsafe Speed | 15 | < 0.1%
Pavement Slippery | 12 | < 0.1%
Passing or Lane Usage Improper | 5 | < 0.1%
Driver Inexperience | 5 | < 0.1%
Driverless/Runaway Vehicle | 4 | < 0.1%
Other values (20) | 42 | < 0.1%
(Missing) | 91239 | 91.2%

Frequencies of value counts

Unique

Unique: 7
Unique (%): 0.1%

Histogram of lengths of the category

Length

Max length: 53
Median length: 3
Mean length: 3.75874
Min length: 3

contributing_factor_vehicle_4
Categorical

MISSING

Distinct: 12
Distinct (%): 0.5%
Missing: 97760
Missing (%): 97.8%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
Unspecified | 2107 | 2.1%
Other Vehicular | 54 | 0.1%
Following Too Closely | 41 | < 0.1%
Driver Inattention/Distraction | 22 | < 0.1%
Pavement Slippery | 4 | < 0.1%
Reaction to Uninvolved Vehicle | 3 | < 0.1%
Unsafe Speed | 3 | < 0.1%
Aggressive Driving/Road Rage | 2 | < 0.1%
Obstruction/Debris | 1 | < 0.1%
Outside Car Distraction | 1 | < 0.1%
Other values (2) | 2 | < 0.1%
(Missing) | 97760 | 97.8%

Frequencies of value counts

Unique

Unique: 4
Unique (%): 0.2%

Histogram of lengths of the category

Length

Max length: 30
Median length: 3
Mean length: 3.19114
Min length: 3

contributing_factor_vehicle_5
Categorical

MISSING

Distinct: 9
Distinct (%): 1.3%
Missing: 99333
Missing (%): 99.3%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
Unspecified | 622 | 0.6%
Other Vehicular | 24 | < 0.1%
Following Too Closely | 12 | < 0.1%
Driver Inattention/Distraction | 3 | < 0.1%
Pavement Slippery | 2 | < 0.1%
Passing Too Closely | 1 | < 0.1%
Reaction to Uninvolved Vehicle | 1 | < 0.1%
Obstruction/Debris | 1 | < 0.1%
Unsafe Speed | 1 | < 0.1%
(Missing) | 99333 | 99.3%

Frequencies of value counts

Unique

Unique: 4
Unique (%): 0.6%

Histogram of lengths of the category

Length

Max length: 30
Median length: 3
Mean length: 3.05656
Min length: 3

collision_id
Real number (ℝ≥0)

UNIQUE

Distinct: 100000
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4226109.341
Minimum: 2568
Maximum: 4353706
Zeros: 0
Zeros (%): 0.0%
Memory size: 781.2 KiB

Quantile statistics

Minimum: 2568
5-th percentile: 3665427.95
Q1: 4182342.75
median: 4300224
Q3: 4328315.25
95-th percentile: 4348345.05
Maximum: 4353706
Range: 4351138
Interquartile range (IQR): 145972.5

Descriptive statistics

Standard deviation: 165356.0511
Coefficient of variation (CV): 0.03912725341
Kurtosis: 45.22161792
Mean: 4226109.341
Median Absolute Deviation (MAD): 51882.5
Skewness: -3.965406795
Sum: 4.226109341e+11
Variance: 2.734262364e+10
Monotonicity: Not monotonic

Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
4196351 | 1 | < 0.1%
4196884 | 1 | < 0.1%
4303971 | 1 | < 0.1%
4335798 | 1 | < 0.1%
4159994 | 1 | < 0.1%
4321918 | 1 | < 0.1%
4167876 | 1 | < 0.1%
4310832 | 1 | < 0.1%
4318727 | 1 | < 0.1%
4319172 | 1 | < 0.1%
Other values (99990) | 99990 | > 99.9%

Value | Count | Frequency (%)
2568 | 1 | < 0.1%
69010 | 1 | < 0.1%
74294 | 1 | < 0.1%
127733 | 1 | < 0.1%
210591 | 1 | < 0.1%

Value | Count | Frequency (%)
4353706 | 1 | < 0.1%
4353705 | 1 | < 0.1%
4353701 | 1 | < 0.1%
4353672 | 1 | < 0.1%
4353663 | 1 | < 0.1%
 

vehicle_type_code1
Categorical

HIGH CARDINALITY

Distinct: 366
Distinct (%): 0.4%
Missing: 740
Missing (%): 0.7%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
Sedan | 46790 | 46.8%
Station Wagon/Sport Utility Vehicle | 35766 | 35.8%
Taxi | 3478 | 3.5%
Pick-up Truck | 2615 | 2.6%
Box Truck | 1946 | 1.9%
Bike | 1437 | 1.4%
Bus | 1125 | 1.1%
Motorcycle | 921 | 0.9%
Tractor Truck Diesel | 751 | 0.8%
Van | 572 | 0.6%
Other values (356) | 3859 | 3.9%
(Missing) | 740 | 0.7%

Frequencies of value counts

Unique

Unique: 225
Unique (%): 0.2%

Histogram of lengths of the category

Length

Max length: 38
Median length: 5
Mean length: 16.23289
Min length: 1

vehicle_type_code2
Categorical

HIGH CARDINALITY
MISSING

Distinct: 385
Distinct (%): 0.5%
Missing: 26589
Missing (%): 26.6%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
Sedan | 31369 | 31.4%
Station Wagon/Sport Utility Vehicle | 24773 | 24.8%
Bike | 3586 | 3.6%
Taxi | 2300 | 2.3%
Pick-up Truck | 2282 | 2.3%
Box Truck | 2146 | 2.1%
Bus | 1011 | 1.0%
Tractor Truck Diesel | 763 | 0.8%
Motorcycle | 731 | 0.7%
Van | 537 | 0.5%
Other values (375) | 3913 | 3.9%
(Missing) | 26589 | 26.6%

Frequencies of value counts

Unique

Unique: 218
Unique (%): 0.3%

Histogram of lengths of the category

Length

Max length: 38
Median length: 5
Mean length: 12.37042
Min length: 2

vehicle_type_code_3
Categorical

HIGH CARDINALITY
MISSING

Distinct: 64
Distinct (%): 0.8%
Missing: 91671
Missing (%): 91.7%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
Sedan | 4129 | 4.1%
Station Wagon/Sport Utility Vehicle | 3380 | 3.4%
Pick-up Truck | 195 | 0.2%
Taxi | 187 | 0.2%
Box Truck | 73 | 0.1%
Motorcycle | 46 | < 0.1%
Bike | 46 | < 0.1%
Bus | 41 | < 0.1%
Van | 40 | < 0.1%
Tractor Truck Diesel | 37 | < 0.1%
Other values (54) | 155 | 0.2%
(Missing) | 91671 | 91.7%

Frequencies of value counts

Unique

Unique: 32
Unique (%): 0.4%

Histogram of lengths of the category

Length

Max length: 35
Median length: 3
Mean length: 4.20942
Min length: 2

vehicle_type_code_4
Categorical

MISSING

Distinct: 31
Distinct (%): 1.4%
Missing: 97853
Missing (%): 97.9%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
Sedan | 1143 | 1.1%
Station Wagon/Sport Utility Vehicle | 847 | 0.8%
Pick-up Truck | 49 | < 0.1%
Taxi | 32 | < 0.1%
Box Truck | 12 | < 0.1%
Convertible | 10 | < 0.1%
Bus | 10 | < 0.1%
Motorcycle | 7 | < 0.1%
Dump | 5 | < 0.1%
Van | 4 | < 0.1%
Other values (21) | 28 | < 0.1%
(Missing) | 97853 | 97.9%

Frequencies of value counts

Unique

Unique: 16
Unique (%): 0.7%

Histogram of lengths of the category

Length

Max length: 35
Median length: 3
Mean length: 3.30334
Min length: 3

vehicle_type_code_5
Categorical

MISSING

Distinct: 18
Distinct (%): 2.8%
Missing: 99354
Missing (%): 99.4%
Memory size: 781.2 KiB

Value | Count | Frequency (%)
Sedan | 326 | 0.3%
Station Wagon/Sport Utility Vehicle | 263 | 0.3%
Pick-up Truck | 23 | < 0.1%
Taxi | 6 | < 0.1%
Box Truck | 4 | < 0.1%
Van | 4 | < 0.1%
Motorcycle | 4 | < 0.1%
PK | 3 | < 0.1%
Convertible | 3 | < 0.1%
Bus | 2 | < 0.1%
Other values (8) | 8 | < 0.1%
(Missing) | 99354 | 99.4%

Frequencies of value counts

Unique

Unique: 8
Unique (%): 1.2%

Histogram of lengths of the category

Length

Max length: 35
Median length: 3
Mean length: 3.0943
Min length: 2

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 97.7 KiB

Value | Count | Frequency (%)
True | 55394 | 55.4%
False | 44606 | 44.6%

Interactions

[Figures: 49 pairwise interaction plots]

Correlations

Pearson's r

Pearson's correlation coefficient (r) measures the linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
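
As a minimal sketch of this calculation, assuming two equal-length lists of numbers with no missing values (the function name pearson_r and the sample values are illustrative and are not taken from the report's own implementation):

```python
import math


def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # The 1/n factors of the covariance and the standard deviations cancel,
    # so plain sums of products and squares are enough.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined for a constant variable")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical values: y is an exact linear function of x.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0 * v + 5.0 for v in x]
r_linear = pearson_r(x, y)
# Shifting and rescaling y (by a positive factor) should leave r unchanged.
r_rescaled = pearson_r(x, [10.0 * v - 3.0 for v in y])
```

Both calls should give a value of 1 up to floating-point rounding, which illustrates the invariance under changes in location and scale.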

Spearman's ρ

Spearman's rank correlation coefficient (ρ) measures the monotonic correlation between two variables, and is therefore better than Pearson's r at capturing nonlinear monotonic relationships. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
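
The rank-based calculation can be sketched as follows. This is an illustrative pure-Python version that assumes ties receive the average of the ranks they span (a common convention, not confirmed by the report); the names average_ranks and spearman_rho and the sample values are hypothetical.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of equal values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Plain Pearson correlation, applied below to the rank variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(x, y):
    """Pearson correlation of the rank variables of x and y."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical values: y = x**3 is monotonic in x but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
rho = spearman_rho(x, y)
```

Because the cubic relationship is strictly increasing, the ranks of x and y coincide and ρ should come out as 1 up to rounding, whereas Pearson's r on the raw values would be below 1.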

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
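
The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch under that assumption follows; the function name kendall_tau_a and the sample values are illustrative, and statistical libraries often report tie-adjusted variants such as tau-b instead, so results can differ when ties are present.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs; tied pairs count as neither."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = 0
    discordant = 0
    for i, j in combinations(range(n), 2):
        # The sign of the product says whether x and y move in the same direction.
        direction = (x[i] - x[j]) * (y[i] - y[j])
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical values: 2 concordant pairs and 1 discordant pair out of 3.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
```

Here τ = (2 - 1) / 3. The double loop is quadratic in the number of observations, so this sketch is meant for illustration rather than for a dataset of 100,000 rows.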

Phik (φk)

Phik (φk) is a correlation coefficient that works consistently across categorical, ordinal and interval variables, captures non-linear dependency, and reduces to the Pearson correlation coefficient when the input follows a bivariate normal distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
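
A sketch of a bias-corrected Cramér's V in the form commonly attributed to Bergsma (2013) is shown below. It is written in pure Python for illustration and is not the report's own code; the function name cramers_v_corrected and the category labels are hypothetical, and pairs with missing values are assumed to have been dropped beforehand.

```python
import math
from collections import Counter


def cramers_v_corrected(x, y):
    """Bias-corrected Cramér's V for two equal-length sequences of labels."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    row_totals = Counter(x)
    col_totals = Counter(y)
    observed = Counter(zip(x, y))
    r, k = len(row_totals), len(col_totals)
    if n < 2 or r < 2 or k < 2:
        return 0.0  # no association can be measured for a constant variable
    # Pearson chi-squared statistic over the full r x k contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (observed[(a, b)] - expected) ** 2 / expected
    phi2 = chi2 / n
    # Bias correction: shrink phi^2 and the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0
    return math.sqrt(phi2_corr / denom)


# Hypothetical labels: y is fully determined by x, so association is perfect.
x = ["a"] * 50 + ["b"] * 50
y = ["u"] * 50 + ["v"] * 50
v_perfect = cramers_v_corrected(x, y)
# Here every combination occurs equally often, so x and y look independent.
v_independent = cramers_v_corrected(["a", "a", "b", "b"], ["u", "v", "u", "v"])
```

With these inputs the first value should be 1 up to rounding and the second exactly 0, matching the two ends of the range described above.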

Missing values

Sample

First rows

(index) | crash_date | crash_time | borough | zip_code | latitude | longitude | location | on_street_name | off_street_name | cross_street_name | number_of_persons_injured | number_of_persons_killed | number_of_pedestrians_injured | number_of_pedestrians_killed | number_of_cyclist_injured | number_of_cyclist_killed | number_of_motorist_injured | number_of_motorist_killed | contributing_factor_vehicle_1 | contributing_factor_vehicle_2 | contributing_factor_vehicle_3 | contributing_factor_vehicle_4 | contributing_factor_vehicle_5 | collision_id | vehicle_type_code1 | vehicle_type_code2 | vehicle_type_code_3 | vehicle_type_code_4 | vehicle_type_code_5 | duplicated_location
0 | 2017-04-18T00:00:00.000 | 23:10 | STATEN ISLAND | 10312.0 | 40.536728 | -74.193344 | (40.536728, -74.193344) | NaN | NaN | 243 DARLINGTON AVENUE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Driver Inattention/Distraction | Unspecified | NaN | NaN | NaN | 3654181 | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | NaN | False
1 | 2017-05-06T00:00:00.000 | 13:00 | BRONX | 10472.0 | 40.829052 | -73.850380 | (40.829052, -73.85038) | CASTLE HILL AVENUE | BLACKROCK AVENUE | NaN | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | Failure to Yield Right-of-Way | NaN | NaN | NaN | NaN | 3665311 | Sedan | NaN | NaN | NaN | NaN | False
2 | 2017-04-27T00:00:00.000 | 17:15 | QUEENS | 11420.0 | 40.677303 | -73.804565 | (40.677303, -73.804565) | 135 STREET | FOCH BOULEVARD | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Driver Inattention/Distraction | Unspecified | NaN | NaN | NaN | 3658491 | Sedan | Sedan | NaN | NaN | NaN | False
3 | 2017-05-09T00:00:00.000 | 20:10 | NaN | NaN | 40.624958 | -74.145775 | (40.624958, -74.145775) | FOREST AVENUE | RICHMOND AVENUE | NaN | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | Unspecified | Unspecified | Unspecified | NaN | NaN | 3666554 | Motorcycle | Sedan | Bus | NaN | NaN | False
4 | 2017-04-18T00:00:00.000 | 14:00 | BRONX | 10456.0 | 40.828846 | -73.903120 | (40.828846, -73.90312) | NaN | NaN | 1167 BOSTON ROAD | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Driver Inattention/Distraction | Unspecified | NaN | NaN | NaN | 3653269 | Sedan | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | False
5 | 2017-05-08T00:00:00.000 | 10:33 | NaN | NaN | 40.556454 | -74.207770 | (40.556454, -74.20777) | WEST SHORE EXPRESSWAY | NaN | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Unsafe Lane Changing | Unspecified | NaN | NaN | NaN | 3666365 | Sedan | Sedan | NaN | NaN | NaN | False
6 | 2017-05-10T00:00:00.000 | 6:10 | NaN | NaN | 40.740025 | -73.976260 | (40.740025, -73.97626) | 1 AVENUE | EAST 28 STREET | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Passing or Lane Usage Improper | Unspecified | NaN | NaN | NaN | 3666842 | Taxi | Box Truck | NaN | NaN | NaN | False
7 | 2017-04-24T00:00:00.000 | 9:30 | BROOKLYN | 11203.0 | 40.651646 | -73.932330 | (40.651646, -73.93233) | EAST 48 STREET | CHURCH AVENUE | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Other Vehicular | Other Vehicular | NaN | NaN | NaN | 3657123 | Station Wagon/Sport Utility Vehicle | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | False
8 | 2017-04-14T00:00:00.000 | 13:00 | NaN | NaN | 40.751800 | -73.817314 | (40.7518, -73.817314) | ROBINSON STREET | NaN | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Passing Too Closely | Unspecified | NaN | NaN | NaN | 3651039 | Sedan | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | False
9 | 2017-05-02T00:00:00.000 | 1:00 | BRONX | 10474.0 | 40.816864 | -73.882744 | (40.816864, -73.882744) | NaN | NaN | 772 EDGEWATER ROAD | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | NaN | NaN | NaN | NaN | 3661896 | Pick-up Truck | NaN | NaN | NaN | NaN | False

Last rows

(index) | crash_date | crash_time | borough | zip_code | latitude | longitude | location | on_street_name | off_street_name | cross_street_name | number_of_persons_injured | number_of_persons_killed | number_of_pedestrians_injured | number_of_pedestrians_killed | number_of_cyclist_injured | number_of_cyclist_killed | number_of_motorist_injured | number_of_motorist_killed | contributing_factor_vehicle_1 | contributing_factor_vehicle_2 | contributing_factor_vehicle_3 | contributing_factor_vehicle_4 | contributing_factor_vehicle_5 | collision_id | vehicle_type_code1 | vehicle_type_code2 | vehicle_type_code_3 | vehicle_type_code_4 | vehicle_type_code_5 | duplicated_location
99990 | 2019-11-08T00:00:00.000 | 19:20 | BROOKLYN | 11218.0 | NaN | NaN | NaN | OCEAN PARKWAY | AVENUE C | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | Unspecified | NaN | NaN | NaN | 4238828 | Sedan | Sedan | NaN | NaN | NaN | True
99991 | 2019-11-11T00:00:00.000 | 15:55 | NaN | NaN | 40.661540 | -73.982740 | (40.66154, -73.98274) | 16 STREET | NaN | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Outside Car Distraction | Unspecified | NaN | NaN | NaN | 4239244 | Sedan | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | True
99992 | 2019-11-13T00:00:00.000 | 11:00 | BRONX | 10461.0 | 40.836597 | -73.840546 | (40.836597, -73.840546) | NaN | NaN | 1332 COMMERCE AVENUE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | Unspecified | NaN | NaN | NaN | 4240499 | Pick-up Truck | Sedan | NaN | NaN | NaN | True
99993 | 2019-12-04T00:00:00.000 | 7:00 | QUEENS | 11385.0 | 40.703407 | -73.883484 | (40.703407, -73.883484) | NaN | NaN | 71-17 69 STREET | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Passing Too Closely | Unspecified | NaN | NaN | NaN | 4252028 | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | NaN | False
99994 | 2019-11-15T00:00:00.000 | 13:05 | BROOKLYN | 11206.0 | 40.701862 | -73.943830 | (40.701862, -73.94383) | WHIPPLE STREET | BROADWAY | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Following Too Closely | Unspecified | NaN | NaN | NaN | 4242657 | Sedan | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | True
99995 | 2019-11-20T00:00:00.000 | 15:00 | BROOKLYN | 11210.0 | 40.618893 | -73.946420 | (40.618893, -73.94642) | NaN | NaN | 1314 EAST 29 STREET | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | NaN | NaN | NaN | NaN | 4244961 | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | NaN | False
99996 | 2019-12-01T00:00:00.000 | 11:22 | QUEENS | 11367.0 | 40.723380 | -73.814750 | (40.72338, -73.81475) | NaN | NaN | 150-62 76 ROAD | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | Unspecified | NaN | NaN | NaN | 4250093 | Station Wagon/Sport Utility Vehicle | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | False
99997 | 2019-11-21T00:00:00.000 | 21:30 | BROOKLYN | 11249.0 | 40.710820 | -73.968530 | (40.71082, -73.96853) | BROADWAY | KENT AVENUE | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Passing Too Closely | Unspecified | NaN | NaN | NaN | 4245290 | Sedan | Box Truck | NaN | NaN | NaN | False
99998 | 2019-11-18T00:00:00.000 | 17:28 | BROOKLYN | 11234.0 | 40.631180 | -73.928185 | (40.63118, -73.928185) | NaN | NaN | 1695 UTICA AVENUE | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Driver Inattention/Distraction | Unspecified | NaN | NaN | NaN | 4243646 | Sedan | Bus | NaN | NaN | NaN | False
99999 | 2019-11-17T00:00:00.000 | 20:42 | MANHATTAN | 10017.0 | 40.750760 | -73.968430 | (40.75076, -73.96843) | EAST 45 STREET | 1 AVENUE | NaN | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | Driver Inattention/Distraction | Driver Inattention/Distraction | NaN | NaN | NaN | 4247517 | Sedan | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | True